# INT4 Quantized Inference
## DeepSeek-R1-0528-quantized.w4a16
**RedHatAI** · MIT · Large Language Model · Safetensors

A quantized version of DeepSeek-R1-0528: the weights are quantized to the INT4 data type, significantly reducing GPU memory and disk-space requirements.
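The w4a16 scheme named in these cards keeps activations in 16-bit floating point and stores weights as 4-bit integers with a per-group scale. A minimal sketch of symmetric per-group INT4 quantization, assuming a toy group size and illustrative values (real schemes typically use group sizes of 64 or 128):

```python
# Sketch of w4a16-style weight quantization: weights are stored as 4-bit
# integers plus one scale per group, and dequantized before the matmul.
# Group size and weight values below are illustrative assumptions.

GROUP_SIZE = 4  # real deployments typically use 64 or 128

def quantize_group(weights):
    """Symmetric INT4 quantization of one group: returns (int4_values, scale)."""
    # Map the largest magnitude in the group to the INT4 range (-8..7).
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_group(q, scale):
    """Reconstruct approximate FP values from INT4 codes and the group scale."""
    return [v * scale for v in q]

weights = [0.12, -0.50, 0.33, 0.07, 1.20, -0.90, 0.01, 0.45]
groups = [weights[i:i + GROUP_SIZE] for i in range(0, len(weights), GROUP_SIZE)]

recon = []
for g in groups:
    q, s = quantize_group(g)
    recon.extend(dequantize_group(q, s))

# Worst-case reconstruction error is bounded by half the group's scale.
max_err = max(abs(a - b) for a, b in zip(weights, recon))
```

Per-group scales keep the quantization error proportional to each group's local magnitude, which is why outlier weights in one group do not degrade the precision of the rest of the matrix.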
## Qwen2.5-VL-3B-Instruct-quantized.w4a16
**RedHatAI** · Apache-2.0 · Image-Text-to-Text · Transformers · English

A quantized version of Qwen2.5-VL-3B-Instruct, with weights quantized to INT4 and activations in FP16, designed for efficient vision-text inference.
## Qwen2.5-VL-7B-Instruct-quantized.w4a16
**RedHatAI** · Apache-2.0 · Image-Text-to-Text · Transformers · English

A quantized version of Qwen2.5-VL-7B-Instruct, supporting vision-text input and text output, with weights quantized to INT4 and activations in FP16.
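As a back-of-envelope check on the memory-reduction claim, INT4 weights plus per-group FP16 scales take roughly a quarter of the space of FP16 weights. The parameter count and group size below are illustrative assumptions, not figures for any specific model listed above:

```python
# Rough storage estimate: FP16 weights vs INT4 weights with per-group
# FP16 scales. Both numbers below are assumptions for illustration.
params = 7_000_000_000   # e.g. a 7B-parameter model (assumed)
group_size = 128         # one FP16 scale per 128 weights (assumed)

fp16_bytes = params * 2                                 # 16 bits per weight
int4_bytes = params // 2 + (params // group_size) * 2   # 4-bit weights + scales

fp16_gb = fp16_bytes / 1e9   # ~14.0 GB
int4_gb = int4_bytes / 1e9   # ~3.6 GB
ratio = fp16_bytes / int4_bytes
```

The scale overhead keeps the effective compression just under the ideal 4x, which matches the "significantly reduced GPU memory and disk space" framing in the cards above.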